Center vector


Using predefined vector systems as latent space configuration for neural network supervised training on data with arbitrarily large number of classes

Gabdullin, Nikita

arXiv.org Artificial Intelligence

Supervised learning (SL) methods are indispensable for training neural networks (NNs) to perform classification tasks. While SL training results in very high accuracy, it often requires making the number of NN parameters dependent on the number of classes, which limits applicability when the number of classes is extremely large or unknown in advance. In this paper we propose a methodology that allows one to train the same NN architecture regardless of the number of classes. This is achieved by using predefined vector systems as the target latent space configuration (LSC) during NN training. We discuss the desired properties of target configurations and choose randomly perturbed vectors of the A_n root system for our experiments. These vectors are used to successfully train encoders and vision transformers (ViTs) on CINIC-10 and ImageNet-1K in low- and high-dimensional cases by matching NN predictions with the predefined vectors. Finally, a ViT is trained on a dataset with 1.28 million classes, illustrating the applicability of the method to datasets with extremely large numbers of classes. In addition, potential applications of LSC in lifelong learning and NN distillation are discussed, illustrating the versatility of the proposed methodology.
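The core mechanic lends itself to a short illustration. Below is a minimal PyTorch sketch, not the authors' code: randomly perturbed unit vectors stand in for the paper's perturbed A_n root vectors, and all function names are hypothetical.

```python
# Sketch: assign each class a fixed target vector and train the encoder to
# match it, so the architecture is independent of the number of classes.
import torch
import torch.nn.functional as F

def make_targets(num_classes, dim, noise=0.01, seed=0):
    """Fixed unit target vectors, one per class (stand-in for A_n root vectors)."""
    g = torch.Generator().manual_seed(seed)
    targets = torch.randn(num_classes, dim, generator=g)
    targets += noise * torch.randn(num_classes, dim, generator=g)  # random perturbation
    return F.normalize(targets, dim=1)

def lsc_loss(embeddings, labels, targets):
    """Pull each embedding toward its class's predefined target (cosine matching)."""
    z = F.normalize(embeddings, dim=1)
    return (1.0 - (z * targets[labels]).sum(dim=1)).mean()

# Usage: targets = make_targets(1_280_000, 512); loss = lsc_loss(encoder(x), y, targets)
```

Note that the encoder's output dimension stays fixed; only the lookup table of targets grows with the number of classes.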


The Common Stability Mechanism behind most Self-Supervised Learning Approaches

Jha, Abhishek, Blaschko, Matthew B., Asano, Yuki M., Tuytelaars, Tinne

arXiv.org Artificial Intelligence

The last couple of years have witnessed tremendous progress in self-supervised learning (SSL), whose success can be attributed to the introduction of useful inductive biases in the learning process that yield meaningful visual representations while avoiding collapse. These inductive biases and constraints manifest themselves as different optimization formulations in SSL techniques, e.g., by utilizing negative examples in a contrastive formulation, or an exponential moving average and a predictor in BYOL and SimSiam. In this paper, we provide a framework that explains the stability mechanism of these different SSL techniques: i) we discuss the working mechanism of contrastive techniques like SimCLR and non-contrastive techniques like BYOL, SwAV, SimSiam, Barlow Twins, and DINO; ii) we argue that despite their different formulations these methods implicitly optimize a similar objective function, i.e., minimizing the magnitude of the expected representation over all data samples (the mean of the data distribution) while maximizing the magnitude of the expected representation of individual samples over different data augmentations; iii) we provide mathematical and empirical evidence to support our framework. We formulate different hypotheses and test them using the ImageNet100 dataset.
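Point (ii) can be condensed into a single expression; the notation below is a paraphrase of the abstract, not the paper's exact formulation:

```latex
% Shared SSL objective as sketched in point (ii); f is the encoder,
% x ~ D a data sample, t ~ T a random augmentation.
\[
\min_f \; \Big\lVert \mathbb{E}_{x \sim \mathcal{D}}\big[ f(x) \big] \Big\rVert
\quad \text{while} \quad
\max_f \; \mathbb{E}_{x \sim \mathcal{D}} \Big\lVert \mathbb{E}_{t \sim \mathcal{T}}\big[ f(t(x)) \big] \Big\rVert
\]
```

Keeping the mean representation small prevents collapse to a constant, while keeping each sample's augmentation-averaged representation large keeps representations informative.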


Generalized Gaussian Kernel Adaptive Filtering

Wada, Tomoya, Fukumori, Kosuke, Tanaka, Toshihisa, Fiori, Simone

arXiv.org Machine Learning

The present paper proposes generalized Gaussian kernel adaptive filtering, in which the kernel parameters are adaptive and data-driven. The Gaussian kernel is parametrized by a center vector and a symmetric positive definite (SPD) precision matrix, which is regarded as a generalization of the scalar width parameter. These parameters are adaptively updated on the basis of a proposed least-squares-type rule that minimizes the estimation error. The main contribution of this paper is to establish update rules for precision matrices on the SPD manifold that preserve their symmetric positive definiteness. Unlike conventional kernel adaptive filters, the proposed regressor is a superposition of Gaussian kernels, each with its own parameters, which makes the regressor more flexible. The kernel adaptive filtering algorithm is established together with an l1-regularized least-squares term to avoid overfitting and uncontrolled growth of the dictionary. Experimental results confirm the validity of the proposed method.
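The kernel parametrization described above is simple to write down; here is a minimal sketch of just the kernel evaluation (not the paper's update rules, and the names are illustrative):

```python
# Generalized Gaussian kernel: k(x, c) = exp(-0.5 * (x - c)^T P (x - c)),
# where c is the center vector and P an SPD precision matrix that
# generalizes the scalar width parameter.
import numpy as np

def generalized_gaussian_kernel(x, center, precision):
    d = x - center
    return np.exp(-0.5 * d @ precision @ d)

# With P = (1 / width**2) * I this reduces to the ordinary Gaussian kernel.
x = np.array([0.5, -1.0])
c = np.array([0.0, 0.0])
P = np.array([[2.0, 0.3],
              [0.3, 1.0]])  # SPD; the paper's updates must keep it on the SPD manifold
print(generalized_gaussian_kernel(x, c, P))
```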


Normalized center loss for language modeling – Towards Data Science – Medium

@machinelearnbot

Caveat: Some knowledge of recurrent neural networks is assumed. In language modeling we try to predict the next word given a sequence of words. The model computes a probability distribution over the possible values for the next word, and a word is sampled from that distribution. To do this, you build a recurrent neural network architecture.
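A minimal PyTorch sketch of the setup the post describes (hyperparameters and names are illustrative, not taken from the post):

```python
# An RNN language model that outputs a distribution over the vocabulary
# and samples the next word from it.
import torch
import torch.nn as nn

class RNNLanguageModel(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)  # logits over the vocabulary

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h[:, -1])  # logits for the word after the sequence

model = RNNLanguageModel(vocab_size=10_000)
context = torch.randint(0, 10_000, (1, 5))           # a sequence of word ids
probs = torch.softmax(model(context), dim=-1)        # distribution over next word
next_word = torch.multinomial(probs, num_samples=1)  # sample from it
```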